
Replace MEM_MD5_DIGEST with generic MEM_16B_BUF #1874

Open · yadij wants to merge 1 commit into base: master
Conversation

@yadij (Contributor) commented Jul 30, 2024

No description provided.

Commit: ... with generic 16 byte buffer.
@rousskov rousskov changed the title from "Maintenance: Replace MEM_MD5_DIGEST memory pool" to "Replace MEM_MD5_DIGEST with generic MEM_16B_BUF" Jul 31, 2024
@@ -138,7 +138,7 @@ storeKeyPublicByRequestMethod(HttpRequest * request, const HttpRequestMethod& me
 cache_key *
 storeKeyDup(const cache_key * key)
 {
-    cache_key *dup = (cache_key *)memAllocate(MEM_MD5_DIGEST);
+    cache_key *dup = (cache_key *)memAllocBuf(SQUID_MD5_DIGEST_LENGTH, nullptr);
rousskov (Contributor) commented:
MEM_MD5_DIGEST pool should be removed, but it should not be replaced with another pool. Instead, cache_key should not be dynamically allocated at all! The key should become a cheap-to-create/copy/compare/destroy class based on something like two uint64_t integer data members. We may even have an old TODO about this long-awaited improvement somewhere...

I am not blocking this PR on these "wrong MEM_MD5_DIGEST replacement" grounds because there may be performance value in introducing a generic 16-byte pool (as discussed elsewhere).
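
A minimal sketch of the class idea above, assuming a design with two uint64_t data members; all names here are illustrative, not an agreed Squid design:

    // Hypothetical replacement for dynamically allocated cache_key buffers:
    // a value type that is cheap to create, copy, compare, and destroy.
    #include <cstdint>
    #include <cstring>

    class CacheKey
    {
    public:
        CacheKey() = default;

        // import a raw 16-byte MD5 digest
        explicit CacheKey(const unsigned char *digest) {
            std::memcpy(&hi_, digest, sizeof(hi_));
            std::memcpy(&lo_, digest + sizeof(hi_), sizeof(lo_));
        }

        bool operator ==(const CacheKey &other) const {
            return hi_ == other.hi_ && lo_ == other.lo_;
        }

    private:
        uint64_t hi_ = 0; // first eight digest bytes
        uint64_t lo_ = 0; // last eight digest bytes
    };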

src/mem/old_api.cc (review thread resolved)
@@ -310,8 +315,6 @@ Mem::Init(void)
     // TODO: Carefully stop zeroing these objects memory and drop the doZero parameter
     memDataInit(MEM_DREAD_CTRL, "dread_ctrl", sizeof(dread_ctrl), 0, true);
     memDataInit(MEM_DWRITE_Q, "dwrite_q", sizeof(dwrite_q), 0, true);
-    memDataInit(MEM_MD5_DIGEST, "MD5 digest", SQUID_MD5_DIGEST_LENGTH, 0, true);
rousskov (Contributor) commented:

Introducing smaller generic pools may improve or harm performance. It may harm performance if we happen to have some frequently used context (e.g., in HTTP header parsing) where a short SBuf often grows from below 16 bytes to below 32 bytes -- introducing one extra memory allocation/copy. We should not make such performance-sensitive/focused changes without a pressing need and without performance testing!
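
A sketch of the growth pattern in question, assuming Squid's SBuf::append(const char *) API; the header value and pool sizes are illustrative only:

    #include "sbuf/SBuf.h"

    // Illustrates the extra allocation/copy concern with a 16-byte pool.
    static void growthExample()
    {
        SBuf header;
        header.append("Range: ");       // 7 bytes: fits a hypothetical 16B pool
        header.append("bytes=0-1023");  // 19 bytes total: would reallocate into
                                        // a 32B pool and copy the first 7 bytes
    }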

I recommend keeping MEM_MD5_DIGEST pool until cache_key is upgraded (as discussed elsewhere). Instead, let's just set its doZero parameter to false so that we can make progress towards removing that pool parameter/functionality as described in the above TODO (line 310/315).
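
A sketch of that recommendation, reusing the memDataInit() call removed in the diff above (only the final doZero argument changes):

    // keep the dedicated pool for now, but stop zeroing its objects
    memDataInit(MEM_MD5_DIGEST, "MD5 digest", SQUID_MD5_DIGEST_LENGTH, 0, false);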

yadij (Contributor, Author) commented:

Initial testing indicates that almost all dynamic buffers start with >1KB allocations (as expected for I/O buffers), even those which initialize with a 0-byte/missing buffer.

The largest use of these 16B buffers is Store MD5 objects, as expected, plus a few hundred global string constants (e.g., header names) that are stored in SBuf or MemBuf and do not reallocate to anything larger.

rousskov (Contributor) commented:

> Initial testing indicates that almost all dynamic buffers start with >1KB allocations

Data from two busy production Squids suggests the opposite conclusion (and supports concerns behind this change request): Short strings (i.e. 40 bytes or shorter) are responsible for the vast majority of relevant allocations and exceed, say, 4KB buffer allocations by two orders of magnitude (e.g., 17,267,013 vs. 142,112 allocations).

Here is a sample from one worker showing the dominance of small buffer allocations -- allocations that may be sensitive to this PR's changes, as detailed in this change request:

Pool            Obj Size (bytes)   Allocated (#)   Allocated (%)
MD5 digest                    16           6,618              0%
Short Strings                 40      17,267,013             90%
Medium Strings               128         736,958              4%
Long Strings                 512         389,430              2%
1KB Strings                 1024           2,376              0%
2K Buffer                   2048           2,945              0%
4KB Strings                 4096         142,112              1%
4K Buffer                   4096             371              0%
8K Buffer                   8192             151              0%
16KB Strings               16384         454,272              2%
16K Buffer                 16384              98              0%
32K Buffer                 32768             267              0%
64K Buffer                 65536          87,965              0%

Please do not change buffers used for short strings in this PR. We can make progress without such risky and effectively untested performance-sensitive changes (as detailed in this change request).

yadij (Contributor, Author) commented:

It is worth noting that the "Short Strings" stats you are looking at were merged into the "64B Buffer" pool (not even the 32B pool) in current Squid, where my analysis was done. Many of them were also String data copies made by header parsers. Those disappeared when SBuf started sharing a larger underlying MemBlob I/O buffer and/or referencing the RegisteredHeader global list.

Would the admin of that busy cache you have access to be willing to patch in a counter of how many times memAllocBuf() is called with a size under 17?
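
A minimal sketch of such a counter, assuming the memAllocBuf(size_t, size_t *) signature implied by the diff above; the wrapper and counter names are hypothetical:

    #include <atomic>
    #include <cstddef>
    #include <cstdint>

    extern void *memAllocBuf(size_t net_size, size_t *gross_size); // assumed signature

    static std::atomic<uint64_t> TinyBufAllocs{0};

    // call in place of memAllocBuf() to count allocations that would fit
    // the proposed 16-byte pool
    static void *
    countingMemAllocBuf(size_t net_size, size_t *gross_size)
    {
        if (net_size < 17)
            ++TinyBufAllocs;
        return memAllocBuf(net_size, gross_size);
    }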

rousskov (Contributor) commented:

> It is worth noting that the "Short Strings" stats you are looking at were merged into the "64B Buffer" pool (not even the 32B pool) in current Squid, where my analysis was done.

I am aware that master/v7 code has a different set of pools than v6 and earlier code (due to 2023 commit 250fd42, at least), but those differences do not affect this change request's analysis, AFAICT.

> Many of them were also String data copies made by header parsers. Those disappeared when SBuf started sharing a larger underlying MemBlob I/O buffer and/or referencing the RegisteredHeader global list.

I do not have enough information to validate the implication that header parsing improvements were enough to make the potential negative side effects of this PR's changes negligible.

> Would the admin of that busy cache you have access to be willing to patch in a counter of how many times memAllocBuf() is called with a size under 17?

Patching those Squids to investigate this PR's side effects is not an option right now. Fortunately, we can make very good progress (including merging this PR!) by focusing on changes that lead to the removal of the doZero parameter (and its associated unwanted functionality).

In fact, even if we had enough anecdotal evidence suggesting that the current PR's changes do not hurt performance, it would still be prudent to extract/isolate those performance-sensitive changes into a dedicated PR!

@rousskov rousskov added the S-waiting-for-author author action is expected (and usually required) label Jul 31, 2024
@yadij yadij added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box and removed S-waiting-for-author author action is expected (and usually required) labels Dec 2, 2024
@yadij yadij requested a review from rousskov December 2, 2024 23:40
@yadij yadij added the S-waiting-for-QA QA team action is needed (and usually required) label Dec 2, 2024
@rousskov rousskov added S-waiting-for-author author action is expected (and usually required) and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Dec 4, 2024
@rousskov rousskov removed their request for review December 4, 2024 15:13
@rousskov (Contributor) commented Dec 4, 2024

> @yadij added the S-waiting-for-QA label

FWIW, I do not know why that label was added, but I recommend resolving PR merge conflicts as the next step towards all-green CI tests.
